Structured grammar-based codes for universal lossless data compression
Similar resources
Structured Grammar-based Codes for Universal Lossless Data Compression
A grammar-based code losslessly compresses each finite-alphabet data string x by compressing a context-free grammar Gx that represents x, in the sense that the language of Gx is {x}. In an earlier paper, we showed that if Gx is an irreducible grammar for every data string x, then the resulting grammar-based code has maximal redundancy per sample O(log log n / log n)...
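To make the phrase "the language of Gx is {x}" concrete, here is a minimal sketch that expands a small hand-written grammar into the single string it derives. The grammar, the `expand` helper, and the example string are assumptions chosen for illustration. They are not taken from the paper, and the grammar is not claimed to be irreducible in the paper's sense.

```python
def expand(grammar, symbol="S"):
    """Return the unique string derived from `symbol`.

    `grammar` maps each variable to exactly one right-hand side (a list of
    symbols). Any symbol that is not a key of `grammar` is a terminal.
    Assumes the grammar is acyclic, so the derivation terminates.
    """
    if symbol not in grammar:
        return symbol
    return "".join(expand(grammar, s) for s in grammar[symbol])


# Hypothetical grammar: S -> A A B, A -> a b, B -> A c
example_grammar = {
    "S": ["A", "A", "B"],
    "A": ["a", "b"],
    "B": ["A", "c"],
}

x = expand(example_grammar)
print(x)  # abababc
```

Because every variable has exactly one production and the grammar has no cycles, the derivation from S is deterministic, so the grammar generates one string and no other.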
Universal lossless data compression algorithms
Contents (excerpt): 4 Improved compression algorithm based on the Burrows–Wheeler transform; 4.1 Modifications of the basic version of the compression algorithm; 5 Conclusions; Acknowledgements; Bibliography; Appendices; A Silesia corpus; B Implementation details; C Detailed options of examined compression programs; D Illustration of the properties of the weight functions; E Detail...
Fountain codes for lossless data compression
This paper proposes a universal variable-length lossless compression algorithm based on fountain codes. The compressor concatenates the Burrows-Wheeler block sorting transform (BWT) with a fountain encoder, together with the closed-loop iterative doping algorithm. The decompressor uses a Belief Propagation algorithm in conjunction with the iterative doping algorithm and the inverse BWT. Linear-...
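The abstract names the Burrows-Wheeler transform as the first stage of the compressor. Below is a minimal sketch of that transform and its inverse, using naive rotation sorting and an explicit end marker. The function names and the `\0` sentinel are assumptions for illustration. The fountain encoder, the iterative doping algorithm, and the belief propagation decoder are not shown.

```python
def bwt(text, end="\0"):
    """Burrows-Wheeler transform via sorted rotations (naive, quadratic memory)."""
    if end in text:
        raise ValueError("input must not contain the end marker")
    s = text + end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last_column, end="\0"):
    """Invert the transform by repeatedly prepending the last column and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original string is the row that ends with the end marker.
    original = next(row for row in table if row.endswith(end))
    return original[:-1]


encoded = bwt("banana")
decoded = inverse_bwt(encoded)
print(repr(encoded), decoded)
```

Practical implementations would build the transform from a suffix array instead of materializing all rotations. The naive version is used here only because it is short and easy to check by hand.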
Grammar-based codes: A new class of universal lossless source codes
We investigate a type of lossless source code called a grammar-based code, which, in response to any input data string x over a fixed finite alphabet, selects a context-free grammar Gx representing x in the sense that x is the unique string belonging to the language generated by Gx. Lossless compression of x takes place indirectly via compression of the production rules of the grammar Gx. It is shown that, su...
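One way to picture how a grammar might be selected for an input string is greedy digram replacement, in the spirit of Re-Pair. The sketch below is a swapped-in technique for illustration only: it is not the grammar transform studied in the paper, and the function names and the `<k>` / `<S>` variable naming are assumptions. The subsequent coding of the production rules is not shown.

```python
from collections import Counter


def build_grammar(x):
    """Build a grammar for `x` by greedy digram replacement (illustrative only).

    Variables are named "<0>", "<1>", ... and the start symbol is "<S>".
    These names are longer than one character, so they never collide with
    the single-character terminals of `x`.
    """
    seq = list(x)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        var = f"<{len(rules)}>"
        rules[var] = list(pair)
        # Replace non-overlapping occurrences of the pair, left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(var)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    rules["<S>"] = seq
    return rules


def expand(rules, symbol="<S>"):
    """Derive the unique string generated from `symbol`."""
    if symbol not in rules:
        return symbol
    return "".join(expand(rules, s) for s in rules[symbol])


grammar = build_grammar("abababab")
assert expand(grammar) == "abababab"
```

Each pass shortens the working sequence by at least one symbol, so the loop terminates, and expanding the start symbol always reproduces the input.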
An Effective Grammar-Based Compression Algorithm for Tree Structured Data
Many kinds of semistructured data, such as HTML/XML files, are represented by rooted trees t in which the children of each internal vertex are ordered and all edges have labels. Such data is called tree structured data. Analyzing large tree structured data is a time-consuming process in data mining. If we can reduce the size of input data without loss of information, we can speed up such a heav...
Journal
Journal title: Communications in Information and Systems
Year: 2002
ISSN: 1526-7555,2163-4548
DOI: 10.4310/cis.2002.v2.n1.a2